Gated spatio and temporal convolutional neural network for activity recognition: towards gated multimodal deep learning

نویسندگان

  • Novanto Yudistira
  • Takio Kurita
چکیده

Human activity recognition requires both visual and temporal cues, making it challenging to integrate these important modalities. The usual schemes for integration are averaging and fixing the weights of both features for all samples. However, how much weight is needed for each sample and modality, is still an open question. A mixture of experts via a gating Convolutional Neural Network (CNN) is one promising architecture for adaptively weighting every sample within a dataset. In this paper, rather than just averaging or using fixed weights, we investigate how a natural associative cortex such as a network integrates expert networks to form a gating CNN scheme. Starting from Red Green Blue color model (RGB) values and optical flows, we show that with proper treatment, the gating CNN scheme works well, indicating future approaches to information integration in future activity recognition.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Spatio-temporal Graph Convolutional Neural Network: A Deep Learning Framework for Traffic Forecasting

The goal of traffic forecasting is to predict the future vital indicators (such as speed, volume and density) of the local traffic network in reasonable response time. Due to the dynamics and complexity of traffic network flow, typical simulation experiments and classic statistical methods cannot satisfy the requirements of mid-and-long term forecasting. In this work, we propose a novel deep le...

متن کامل

معرفی شبکه های عصبی پیمانه ای عمیق با ساختار فضایی-زمانی دوگانه جهت بهبود بازشناسی گفتار پیوسته فارسی

In this article, growable deep modular neural networks for continuous speech recognition are introduced. These networks can be grown to implement the spatio-temporal information of the frame sequences at their input layer as well as their labels at the output layer at the same time. The trained neural network with such double spatio-temporal association structure can learn the phonetic sequence...

متن کامل

EMG-based wrist gesture recognition using a convolutional neural network

Background: Deep learning has revolutionized artificial intelligence and has transformed many fields. It allows processing high-dimensional data (such as signals or images) without the need for feature engineering. The aim of this research is to develop a deep learning-based system to decode motor intent from electromyogram (EMG) signals. Methods: A myoelectric system based on convolutional ne...

متن کامل

Acoustic Modeling Using Bidirectional Gated Recurrent Convolutional Units

Convolutional and bidirectional recurrent neural networks have achieved considerable performance gains as acoustic models in automatic speech recognition in recent years. Latest architectures unify long short-term memory, gated recurrent unit and convolutional neural networks by stacking these different neural network types on each other, and providing short and long-term features to different ...

متن کامل

A Hybrid Method for Traffic Flow Forecasting Using Multimodal Deep Learning

Traffic flow forecasting has been regarded as a key problem of intelligent transport systems. In this work, we propose a hybrid multimodal deep learning method for short-term traffic flow forecasting, which jointly learns the spatial-temporal correlation features and interdependence of multi-modality traffic data by multimodal deep learning architecture. According to the highly nonlinear charac...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • EURASIP J. Image and Video Processing

دوره 2017  شماره 

صفحات  -

تاریخ انتشار 2017